Proposed fix for issue #21519: Reshape layer does not handle -1 shape… #21568

Open · wants to merge 8 commits into master from reshape_fix
Conversation

@roebel (Contributor) commented on Aug 9, 2025

I have implemented a fix that avoids precalculating a statically resolved reshape target in the build method; instead, it computes the correct shape information in the call method for the problematic case where the shape information is incomplete. The fix passes the reshape tests for all backends.
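The core idea can be sketched with a small standalone helper (a simplified illustration of the approach, not the actual Keras code; the function name is hypothetical):

```python
def resolve_target_shape(input_shape, target_shape):
    """Resolve a single -1 in target_shape from the known non-batch
    input dimensions. input_shape excludes the batch dimension and is
    assumed fully static; target_shape may contain at most one -1."""
    if list(target_shape).count(-1) > 1:
        raise ValueError("target_shape may contain at most one -1")
    if -1 not in target_shape:
        return tuple(target_shape)
    total = 1
    for dim in input_shape:
        total *= dim
    known = 1
    for dim in target_shape:
        if dim != -1:
            known *= dim
    if total % known != 0:
        raise ValueError("cannot resolve -1: sizes are incompatible")
    return tuple(total // known if dim == -1 else dim for dim in target_shape)

# Example: input (6, 4) reshaped to (-1, 8) resolves to (3, 8),
# even when the batch dimension is unknown at trace time.
print(resolve_target_shape((6, 4), (-1, 8)))  # (3, 8)
```

This mirrors what the fix does in `call` when the batch dimension is `None` but all other input dimensions are static.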


google-cla bot commented Aug 9, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist bot (Contributor) left a comment

Summary of Changes

Hello @roebel, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've implemented a fix for the `Reshape` layer in Keras to correctly handle cases where the `target_shape` includes a `-1` placeholder and the input batch size is `None`. Previously, this scenario could lead to incorrect shape calculations. My changes ensure that the target shape is dynamically computed within the `call` method, rather than being statically resolved during `build`, addressing issue #21519 and improving the layer's robustness.

Highlights

  • Reshape Layer Initialization: I've added validation to the `Reshape` layer's `__init__` method to ensure that the `target_shape` argument does not contain more than one `-1` value, preventing invalid configurations. I also introduced pre-calculated attributes (`need_explicit_shape_for_batch_size_None` and `new_size_no_minus_one`) to optimize dynamic shape computation.
  • Dynamic Shape Resolution: I've removed the `build` method's logic for resolving the target shape, as it was causing issues with dynamic batch sizes. Instead, the `call` method now dynamically computes the final `target_shape` when the input batch size is `None` and a `-1` is present in the `target_shape`, ensuring correct reshaping for partially defined input shapes.
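The validation in the first highlight can be sketched roughly as follows (a hypothetical simplification; the real check lives in the layer's `__init__`):

```python
def validate_target_shape(target_shape):
    # A target shape like (-1, -1, 8) is ambiguous: there is no unique
    # way to distribute the remaining elements across two -1 entries.
    if list(target_shape).count(-1) > 1:
        raise ValueError(
            "target_shape may contain at most one -1, "
            f"received: {target_shape}"
        )

validate_target_shape((3, -1, 8))  # fine: a single -1 is unambiguous
try:
    validate_target_shape((-1, -1, 8))
except ValueError as e:
    print(e)
```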

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request addresses an issue with the Reshape layer's handling of a -1 dimension in the target shape, particularly when dealing with dynamic batch sizes. The fix correctly moves the shape resolution logic from the build method to the call method, enabling dynamic calculation when input shapes are fully or partially known at runtime. The implementation is logical, but I've pointed out a small piece of leftover code that should be cleaned up to improve maintainability.

@codecov-commenter commented on Aug 9, 2025

Codecov Report

❌ Patch coverage is 78.57143% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.19%. Comparing base (ea62750) to head (1145fb8).
⚠️ Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| keras/src/layers/reshaping/reshape.py | 78.57% | 1 Missing and 2 partials ⚠️ |

❗ There is a different number of reports uploaded between BASE (ea62750) and HEAD (1145fb8). Click for more details.

HEAD has 1 upload less than BASE:

| Flag | BASE (ea62750) | HEAD (1145fb8) |
| --- | --- | --- |
| keras-torch | 1 | 0 |
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           master   #21568      +/-   ##
==========================================
- Coverage   82.75%   77.19%   -5.56%     
==========================================
  Files         567      567              
  Lines       56468    56480      +12     
  Branches     8818     8821       +3     
==========================================
- Hits        46730    43601    -3129     
- Misses       7577    10771    +3194     
+ Partials     2161     2108      -53     
```
| Flag | Coverage Δ |
| --- | --- |
| keras | 77.10% <78.57%> (-5.46%) ⬇️ |
| keras-jax | 63.77% <50.00%> (-0.02%) ⬇️ |
| keras-numpy | 58.28% <50.00%> (-0.01%) ⬇️ |
| keras-openvino | 34.66% <7.14%> (-0.01%) ⬇️ |
| keras-tensorflow | 64.21% <78.57%> (-0.01%) ⬇️ |
| keras-torch | ? |

Flags with carried forward coverage won't be shown.
roebel added 2 commits August 10, 2025 01:25
…usively with the case that is not properly handled by ops.reshape. This is when the batch dimension does not have a static shape (shape == None) but all other dimensions have static shapes. In that case we can determine the static shape of the dimension containing -1 from all dimensions that are not the batch dimension.
@roebel roebel force-pushed the reshape_fix branch 2 times, most recently from 90a845f to a502050 Compare August 10, 2025 00:02
@hertschuh (Collaborator) left a comment

Can you provide a unit test that fails before and passes after?

The Reshape layer can handle -1 both with static and dynamic dimensions (and a dynamic batch size). Instead of redoing the work of calculating the dynamically resolved shape, it just lets ops.reshape handle it. No need to duplicate that work.

It does so by having `self._resolved_target_shape` contain a -1 if the shape could not be statically resolved.
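The point about letting `ops.reshape` do the work can be illustrated with plain NumPy (a sketch under the assumption that run-time shapes are always concrete):

```python
import numpy as np

# Once real data arrives, every dimension is concrete, so a leftover -1
# in the resolved target shape is handled by the reshape op itself.
for batch in (2, 5):
    x = np.zeros((batch, 6, 4))
    y = x.reshape((batch, -1, 8))  # -1 resolves to 3 at run time
    print(y.shape)                 # (2, 3, 8), then (5, 3, 8)
```

The open question in the thread is not whether this works at run time, but what static shape the layer should report at trace time.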

@roebel (Contributor, Author) commented on Aug 12, 2025

You find the test case in the new commit. For context: when using convolutional models with time series data (voice processing), the time dimension is hardly ever known beforehand. You could probably force a dynamic dimension here, but convolutional layers do not require it, and the Reshape layer should not need it either.

> Instead of redoing the work of calculating the dynamically resolved shape, it just lets ops.reshape handle it. No need to duplicate that work.

I may not fully understand this, but I thought the Reshape layer has an advantage over the reshape operator because it knows that the batch dimension stays the same. In fact, I wondered why the test case `test_reshape_with_dynamic_batch_size_and_minus_one` produces shape (None, 3, 8) and not (None, None, 8), as this does, for example:

```
ops.reshape(keras.KerasTensor((None, 6, 4)), (None, -1, 8))
Out[8]: <KerasTensor shape=(None, None, 8), dtype=float32, sparse=False, ragged=False, name=keras_tensor_3>
```
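The difference seems to come down to what each API knows: `ops.reshape` treats the leading `None` as just another unknown, while the `Reshape` layer knows the batch dimension is carried through unchanged, so the `-1` can be resolved from the remaining static dimensions alone. A quick check of the arithmetic:

```python
# For input (None, 6, 4) and target (-1, 8): the batch dimension is
# preserved by the layer, so only the non-batch elements matter.
non_batch_elements = 6 * 4            # 24
resolved = non_batch_elements // 8    # the -1 resolves to 3
print((None, resolved, 8))            # (None, 3, 8)
```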

@roebel (Contributor, Author) commented on Aug 12, 2025

I don't see any reason why the torch test fails. It runs fine here in my Keras development environment, and it also runs fine with the mini script I shared in the issue.
